Prediction of highly cited papers
Abstract
In an article in the pages of this journal five years ago, we described a method for predicting which scientific papers will be highly cited in the future, even if they are currently not highly cited. Applying the method to real citation data we made predictions about papers we believed would end up being well cited. Here we revisit those predictions, five years on, to see how well we did. Among the over 2000 papers in our original data set, we examine the fifty that, by the measures of our previous study, were predicted to do best and we find that they have indeed received substantially more citations in the intervening years than other papers, even after controlling for the number of prior citations. On average these top fifty papers have received 23 times as many citations in the last five years as the average paper in the data set as a whole, and 15 times as many as the average paper in a randomly drawn control group that started out with the same number of citations. Applying our prediction technique to current data, we also make new predictions of papers that we believe will be well cited in the next few years.

Editor's choice. Copyright © EPLA, 2014

Introduction. – Citations of scientific papers are considered to be an indicator of papers’ importance and relevance, and a simple count of the number of citations a paper receives is often used as a gauge of its impact. However, it is also widely believed that citations are affected by factors besides pure scientific content, including the journal a paper appears in, author name recognition, and social effects [1,2]. One substantial and well-documented effect is the so-called cumulative advantage or preferential attachment bias, under which papers that have received many citations in the past are expected to receive more in the future, independent of content.
A simple mathematical model of this effect was proposed by Price [3], building on earlier work by Yule [4] and Simon [5], in which paper content is ignored completely and citation is determined solely by preferential attachment plus stochastic effects. Within this model, the expected number of citations a paper receives is a function only of its date of publication, measured from the start of the topic or body of literature in which the paper falls, and shows a strong “first-mover effect” under which the first-published papers are expected to receive many more citations on average than those that come after them. Indeed the variation in citation number as a function of publication date is normally far wider than the stochastic variation among papers published at the same time.

In a previous paper [6] we compared the predictions of Price’s model against citation data for papers from several fields and found good agreement in some, though not all, cases. This suggests that pure citation numbers may not be a good indicator of paper impact, since much of their variation can be predicted from publication dates, without reference to paper content. We therefore proposed an alternative measure of impact: rather than looking for papers with high total citation counts, one should look for papers with counts higher than expected given their date of publication. Since publication date is measured from the start of a field or topic, and since different topics have different start dates, one should only use this method to compare papers within topics. The appropriate calculation is to count the citations a paper has received and compare that figure to the counts for other papers on the same topic that were published around the same time.
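As an aside, the first-mover effect in Price's model mentioned above can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `simulate_price`, the parameters `refs_per_paper` and `a`, and the choice to cite distinct earlier papers are all assumptions made for illustration. Each new paper cites earlier papers with probability proportional to their current citation count plus the constant `a`.

```python
import random


def simulate_price(n_papers, refs_per_paper=3, a=1.0, seed=0):
    """Toy preferential-attachment citation model (illustrative assumptions only).

    Papers are added one at a time in publication order. Each new paper cites
    up to `refs_per_paper` distinct earlier papers, choosing paper i with
    probability proportional to citations[i] + a.
    """
    rng = random.Random(seed)
    citations = [0] * n_papers
    targets = []  # one entry per citation made so far, naming the cited paper

    for new in range(1, n_papers):
        n_refs = min(refs_per_paper, new)
        chosen = set()
        while len(chosen) < n_refs:
            # Total weight is len(targets) + a * new. With probability
            # len(targets) / total, pick proportionally to citation count
            # (a random past citation target); otherwise pick uniformly.
            total = len(targets) + a * new
            if targets and rng.random() < len(targets) / total:
                chosen.add(rng.choice(targets))
            else:
                chosen.add(rng.randrange(new))
        for old in chosen:
            citations[old] += 1
            targets.append(old)
    return citations


if __name__ == "__main__":
    cites = simulate_price(2000)
    early = sum(cites[:50]) / 50
    late = sum(cites[-1000:]) / 1000
    print(f"mean citations, first 50 papers: {early:.1f}")
    print(f"mean citations, last 1000 papers: {late:.2f}")
```

In runs of this toy model the earliest papers accumulate far more citations than later ones even though content plays no role, which is the pattern the text attributes to publication date alone.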
In our work we used a simple z-score to perform the comparison: we calculate the mean number of citations and its standard deviation for papers published in a window close to the date of a paper of interest, then calculate the number of standard deviations by which that paper’s citation
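The windowed z-score comparison described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `windowed_z_scores`, the `half_window` parameter, the inclusion of the paper itself in its own window, and the use of the population standard deviation are all choices made here for the example.

```python
from statistics import mean, pstdev


def windowed_z_scores(citations, half_window=2):
    """Z-score of each paper's citation count against its neighbours in time.

    `citations` is a list of citation counts ordered by publication date
    within a single topic. For each paper, the mean and standard deviation
    are taken over the papers up to `half_window` positions before and after
    it (assumption: the paper itself is included in the window).
    """
    n = len(citations)
    scores = []
    for i, count in enumerate(citations):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = citations[lo:hi]
        mu = mean(window)
        sigma = pstdev(window)  # population standard deviation (assumption)
        # If every paper in the window has the same count, no paper stands out.
        scores.append((count - mu) / sigma if sigma > 0 else 0.0)
    return scores


if __name__ == "__main__":
    counts = [10, 12, 11, 50, 9, 10, 11]  # made-up counts, ordered by date
    for count, z in zip(counts, windowed_z_scores(counts, half_window=3)):
        print(f"{count:3d}  z = {z:+.2f}")
```

A paper with a large positive score has many more citations than its contemporaries, regardless of whether its absolute count is high.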
Similar resources
Highly cited international papers of the country's medical sciences in the Scopus database: 2010 to 2014
Introduction: The scientific output of the Islamic Republic of Iran in Medical Sciences has increased in recent years, as reflected in the Scopus database. Moreover, highly cited papers of researchers, institutions and countries can be used to study the citation impact and quality of scientific output. The current study investigates the quality of Medical Sciences’ scholarly outp...
Four decades of Iran's scientific activity from the perspective of conference papers, highly cited and hot papers, and open access papers, with a look at the Law of the Economic, Social, and Cultural Development Plan of the country
This study aims to investigate Iran's scientific production from the pre-revolutionary period to 2016, with an emphasis on conference proceedings, highly cited and hot papers, and open access papers, in the light of the Law of the Economic, Social, and Cultural Development Plan of Iran. A descriptive-analytical method was used. To achieve the research objectives, data were extracted from Clarivate Analytics (Thomson Reuters)...
Do Scientific Advancements Lean on the Shoulders of Giants? A Bibliometric Investigation of the Ortega Hypothesis
BACKGROUND In contrast to Newton's well-known aphorism that he had been able "to see further only by standing on the shoulders of giants," one attributes to the Spanish philosopher Ortega y Gasset the hypothesis saying that top-level research cannot be successful without a mass of medium researchers on which the top rests comparable to an iceberg. METHODOLOGY/PRINCIPAL FINDINGS The Ortega hyp...
Highly-cited papers in software engineering: The top-100
Context: According to the search reported in this paper, as of this writing (May 2015), a very large number of papers (more than 70,000) have been published in the area of Software Engineering (SE) since its inception in 1968. Citations are crucial in any research area to position the work and to build on the work of others. Identification and characterization of highly-cited papers are common ...
A Bibliometric Analysis of Highly Cited and General Papers in the Computer Science Field
Computer Science, with its rapid development, has been receiving considerable attention and support from Taiwan’s government. Because of its active performance on the international stage, the authors consider Computer Science a suitable research field to investigate issues regarding highly-cited papers. As the literature review reveals, most studies about highly-cited papers covered the applica...
Update on the Most-Cited Papers in the SCI, 1955-1986. Part 2. Sixty Years of Research, from Insecticides to AIDS
In the first part of this essay, we presented a group of 100 highly cited papers from the Science Citation Index® (SCI®), 1955-1986 [1]. That essay, as we mentioned, expands and updates our previous series on the 1,000 most-cited papers in the SCI, 1961-1982 [2], and an essay on 250 highly cited papers from 1955 to 1964 [3]. All of these papers are by definition Citation Classics®. In this essay, we’ll dis...
Journal title:
- CoRR
Volume: abs/1310.8220
Publication date: 2013